[output] Adding new output for PagerDuty Incidents #467

javuto · 2017-11-10T18:21:21Z

to: @ryandeivert @jacknagz
cc: @airbnb/streamalert-maintainers
size: medium

Background

Migrating to the PagerDuty REST API helps to simplify the code and utilize features only present in v2. You can read up on more advantages of the migration to the new API here.
After adding support for the PagerDuty Events API v2, the next step is to create incidents through the API.

Changes

Added support for the PagerDuty Incidents API v2 (REST API) as an output. API reference for creating incidents.
Added tests for the new output.
Added a context field to the rule decorator, stored as a RuleAttributes field. This lets specific rules attach context to the alert record for use in places like the outputs.
Added documentation for the new rule context field.

Testing

$ ./tests/scripts/unit_tests.sh
...
TOTAL                                            2388     81    97%
----------------------------------------------------------------------
Ran 425 tests in 8.940s

OK

$ ./tests/scripts/pylint.sh

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

$ python manage.py lambda test --processor rule all
...
StreamAlertCLI [INFO]: (59/59) Successful Tests
StreamAlertCLI [INFO]: (96/96) Alert Tests Passed
StreamAlertCLI [INFO]: Completed

coveralls · 2017-11-10T18:24:46Z

Coverage decreased (-0.2%) to 95.445% when pulling 6c22135 on javier-streamalert-pagerduty-incidents into 6ae211b on master.

coveralls · 2017-11-10T18:28:26Z

Coverage decreased (-0.2%) to 95.445% when pulling 0842ddf on javier-streamalert-pagerduty-incidents into e75d559 on master.

jacknagz

awesome job, first round of comments!

jacknagz · 2017-11-10T22:45:06Z

docs/source/rules.rst

+context
+~~~~~~~~~~~
+
+``context`` is an optional argument which defines an extra field of information to pass on inside


Can probably be simplified to something like: context is an optional field to pass extra instructions to the alert processor on how to route the alert.

jacknagz · 2017-11-10T22:45:53Z

stream_alert/alert_processor/outputs.py

@@ -13,6 +13,7 @@
 See the License for the specific language governing permissions and
 limitations under the License.
 """
+# pylint: disable=too-many-lines


hah we need to start splitting these out

^^ @jacknagz I told him to do this for now. I also told @armtash to do it for his JIRA output stuff. I will break this out after these merge

jacknagz · 2017-11-10T22:46:56Z

stream_alert/alert_processor/outputs.py

+        Returns:
+            dict: Contains various default items for this output (ie: url)
+        """
+        return {


nit: can this be a single line?

jacknagz · 2017-11-10T22:47:54Z

stream_alert/alert_processor/outputs.py

+    def dispatch(self, **kwargs):
+        """Send incident to Pagerduty Incidents API v2
+
+        Args:


you can just declare Keyword Args instead of Args

jacknagz · 2017-11-10T22:48:54Z

stream_alert/alert_processor/outputs.py

+            return self._log_status(False)
+
+        # Extracting context data to assign the incident
+        rule_context = kwargs['alert']['context'][self.__service__]


what if context isn't in the alert? does it always need to be defined for this output?

or rather, what if it's an empty map?

to jack's point, the default for this will be an empty dict, given this change: https://github.com/airbnb/streamalert/pull/467/files#diff-30cc622fccfff0217d84af2ff5bcd7ffR74

Even if it defaults to an empty dict, I will add a test for that, because the output should just take the default escalation policy if no context is passed.

jacknagz · 2017-11-10T22:50:19Z

stream_alert/alert_processor/outputs.py

+            policy_to_assign = creds['escalation_policy']
+
+        policies_url = os.path.join(creds['api'], self.POLICIES_ENDPOINT)
+        policy_id = self._check_exists_get_id(policy_to_assign,


nit: use consistent line breaks

jacknagz · 2017-11-10T22:53:33Z

stream_alert/alert_processor/outputs.py

+            'details': ''
+        }
+        # We need to get the service id from the API
+        services_url = os.path.join(creds['api'], self.SERVICES_ENDPOINT)


these two lines seem to be repeated in this function, can this be extracted into a helper?
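One way such a helper might look, as a minimal sketch: the function name `endpoint_url` is hypothetical and not part of this PR. It also joins with an explicit `/` rather than `os.path.join`, which uses the operating system's path separator and so only happens to produce a valid URL on POSIX systems.

```python
def endpoint_url(base_url, endpoint):
    """Join an API base URL and an endpoint with exactly one slash.

    Hypothetical helper: stands in for the repeated
    os.path.join(creds['api'], self.<X>_ENDPOINT) lines.
    """
    return '{}/{}'.format(base_url.rstrip('/'), endpoint.lstrip('/'))
```

With this in place, `services_url` and `policies_url` would each be a single call such as `endpoint_url(creds['api'], self.SERVICES_ENDPOINT)`.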

jacknagz · 2017-11-10T22:56:55Z

tests/unit/stream_alert_alert_processor/test_outputs.py

@@ -268,6 +268,144 @@ def test_dispatch_bad_descriptor(self, log_error_mock):

        log_error_mock.assert_called_with('Failed to send alert to %s', self.__service)

+class TestPagerDutyIncidentOutput(object):


Add a test case for an empty context arg

jacknagz · 2017-11-10T22:57:41Z

stream_alert/alert_processor/outputs.py

+        # If there are results, get the first occurence from the list
+        return response and response.get(target_key)[0]['id']
+
+    def dispatch(self, **kwargs):


Please add an expected schema for the context either here or in the class docstring
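A possible shape for that schema, pieced together from the keys the code under review reads (`assigned_user`, `assigned_policy`). The `pagerduty-incident` service key, the example values, and the `unknown_context_keys` helper are assumptions for illustration only.

```python
# Hypothetical example of the per-rule context consumed by this output.
# Key names come from the code under review; the values and the
# 'pagerduty-incident' service key are illustrative assumptions.
EXAMPLE_CONTEXT = {
    'pagerduty-incident': {
        'assigned_user': 'valid_user',      # preferred when present
        'assigned_policy': 'valid_policy',  # used when no user is assigned
    }
}

ALLOWED_KEYS = {'assigned_user', 'assigned_policy'}


def unknown_context_keys(context, service='pagerduty-incident'):
    """Return the sorted keys under `service` that the output would ignore."""
    service_context = context.get(service, {})
    return sorted(set(service_context) - ALLOWED_KEYS)
```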

ryandeivert

first round of comments

ryandeivert · 2017-11-10T23:25:24Z

stream_alert/alert_processor/outputs.py

@@ -13,6 +13,7 @@
 See the License for the specific language governing permissions and
 limitations under the License.
 """
+# pylint: disable=too-many-lines


^^ @jacknagz I told him to do this for now. I also told @armtash to do it for his JIRA output stuff. I will break this out after these merge

ryandeivert · 2017-11-10T23:29:33Z

stream_alert/alert_processor/outputs.py

+            return self._log_status(False)
+
+        # Extracting context data to assign the incident
+        rule_context = kwargs['alert']['context'][self.__service__]


to jack's point, the default for this will be an empty dict, given this change: https://github.com/airbnb/streamalert/pull/467/files#diff-30cc622fccfff0217d84af2ff5bcd7ffR74

ryandeivert · 2017-11-10T23:30:31Z

stream_alert/alert_processor/outputs.py

+                }]
+            # If the user retrieval did not succeed, default to policies
+            else:
+                user_to_assign = False


Remove this else and add the default value to the get(..) above:

user_to_assign = rule_context.get('assigned_user', False)

ryandeivert · 2017-11-10T23:33:29Z

stream_alert/alert_processor/outputs.py

+
+        # Start preparing the incident JSON blob to be sent to the API
+        incident_title = 'StreamAlert Incident - Rule triggered: {}'.format(kwargs['rule_name'])
+        incident_body = {


are the type and details keys within this required? It looks like these values never get set, and this is just passed as the `body` in the incident payload

If these are required, can you add a comment saying so and why?

They are required, I just forgot about them :)

ryandeivert · 2017-11-10T23:34:31Z

stream_alert/alert_processor/outputs.py

+            'id': service_id,
+            'type': 'service_reference'
+        }
+        incident_priority = {


I'd be a fan of just moving this dict and the above dict right into the structure below for the incident.

I was digging into the API docs and it turns out that this is just optional, so maybe something else to provide via context.
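A rough sketch of the inlined structure, assuming the PagerDuty REST API v2 layout for incident creation. The function name `build_incident`, the outer `incident` wrapper, and the `incident` / `incident_body` type strings are not shown in this diff and are assumptions here; the title format and `service_reference` come from the code under review. The optional priority is left out.

```python
def build_incident(rule_name, service_id, details=''):
    """Assemble the incident dict with the nested references written inline."""
    return {
        'incident': {
            'type': 'incident',
            'title': 'StreamAlert Incident - Rule triggered: {}'.format(rule_name),
            'service': {
                'id': service_id,
                'type': 'service_reference',
            },
            # 'type' and 'details' are both kept, per the thread above
            'body': {
                'type': 'incident_body',
                'details': details,
            },
        }
    }
```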

ryandeivert · 2017-11-10T23:36:01Z

stream_alert/alert_processor/outputs.py

+        if not user_to_assign and rule_context.get('assigned_policy'):
+            policy_to_assign = rule_context.get('assigned_policy')
+        else:
+            policy_to_assign = creds['escalation_policy']


move this above the else so it's set by default and then just set it, like so:

policy_to_assign = creds['escalation_policy']
if not user_to_assign and rule_context.get('assigned_policy'):
    policy_to_assign = rule_context.get('assigned_policy')

one less line :)
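Pulled out into a standalone function, the suggested flow might look like the sketch below. The function name `choose_policy` and its parameters are hypothetical; in the PR this logic lives inside `dispatch` and reads the default from `creds['escalation_policy']`.

```python
def choose_policy(rule_context, default_policy, user_to_assign=False):
    """Pick the escalation policy: default first, rule override second."""
    policy_to_assign = default_policy
    # Only honor the rule's policy when no user is being assigned
    if not user_to_assign and rule_context.get('assigned_policy'):
        policy_to_assign = rule_context.get('assigned_policy')
    return policy_to_assign
```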

ryandeivert · 2017-11-10T23:54:58Z

stream_alert/alert_processor/outputs.py

+        # If there are results, get the first occurence from the list
+        return response and response.get(target_key)[0]['id']
+
+    def dispatch(self, **kwargs):


this dispatch function is huge - can you move some of it to functions to make it more readable and make testing easier

coveralls · 2017-11-13T23:21:25Z

Coverage decreased (-0.4%) to 95.283% when pulling f2ca160 on javier-streamalert-pagerduty-incidents into b1966aa on master.

ryandeivert · 2017-11-14T01:28:09Z

tests/unit/stream_alert_alert_processor/helpers.py

+        index (int): test_index value (0 by default)
+        context(dict): context dictionary (empty by default)
+    """
+    if not context:


this can be simplified to context = context or {}
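For illustration, a cut-down stand-in for the test helper. The name `make_alert` and the returned fields are assumptions; only the `index` and `context` arguments come from the docstring in the diff.

```python
def make_alert(index=0, context=None):
    """Build a minimal alert dict for tests (hypothetical stand-in)."""
    # None, {} and any other falsy value all collapse to a fresh empty dict,
    # so separate calls never share one mutable default.
    context = context or {}
    return {'record': {'test_index': index}, 'context': context}
```

Each call without a context gets its own empty dict, while a non-empty context is passed through unchanged.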

coveralls · 2017-11-14T15:34:26Z

Coverage decreased (-0.2%) to 95.404% when pulling 33c0f77 on javier-streamalert-pagerduty-incidents into b1966aa on master.

coveralls · 2017-11-14T19:05:34Z

Coverage decreased (-0.2%) to 95.455% when pulling eb17826 on javier-streamalert-pagerduty-incidents into 71a51cf on master.

coveralls · 2017-11-15T16:57:13Z

Coverage decreased (-0.2%) to 95.455% when pulling ac63b6b on javier-streamalert-pagerduty-incidents into bd8b80f on master.

ryandeivert

A few more comments - this is looking great, thanks for addressing everything so far :)

ryandeivert · 2017-11-15T18:26:47Z

stream_alert/alert_processor/outputs.py

+            return self._log_status(False)
+
+        # Preparing headers for API calls
+        headers = {


suggestion: just set self.headers right here (or self._headers once the instance properties are made protected)

ryandeivert · 2017-11-15T18:28:26Z

stream_alert/alert_processor/outputs.py

+
+    def __init__(self, *args, **kwargs):
+        StreamOutputBase.__init__(self, *args, **kwargs)
+        self.base_url = None


Maybe make these instance properties protected? ie - self._headers, etc.. since they should never be accessed from outside an instance of this class.

ryandeivert · 2017-11-15T18:30:27Z

stream_alert/alert_processor/outputs.py

+
+        # If there are results, get the first occurence from the list
+        target = response.get(target_key, False)
+        if response and target:


Suggestion: returning early whenever there is the opportunity to do so, instead of doing compound if checks throughout a function, helps decrease confusion and increase readability. For instance, this could be made more readable via:

response = resp.json()
if not response:
    return False # Maybe even add a LOGGER.error(...) here (?)

# If there are results, get the first occurrence from the list
target = response.get(target_key, False)
if not target:
    return False # Maybe add a LOGGER.error(...) here too (?)

return target[0]['id']

Theoretically the last if/return can be consolidated into just:
return response[target_key][0]['id'] if target_key in response else False

Just food for thought :)

I like replacing the last if/else with that super pythonic statement
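A self-contained sketch of the early-return version, assuming the parsed JSON response is a dict (or empty/None on failure). The function name `first_item_id` and the log messages are hypothetical.

```python
import logging

LOGGER = logging.getLogger(__name__)


def first_item_id(response, target_key):
    """Return the 'id' of the first item under target_key, or False."""
    if not response:
        LOGGER.error('Empty response from the API')
        return False

    # If there are results, get the first occurrence from the list
    target = response.get(target_key, False)
    if not target:
        LOGGER.error('No results found under key: %s', target_key)
        return False

    return target[0]['id']
```

Note that a membership check on the key alone does not guard against an empty list, which the `if not target` check does.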

ryandeivert · 2017-11-15T18:44:52Z

stream_alert/alert_processor/outputs.py

+        """Method to verify the existance of a service with the API
+
+        Args:
+            api_url (str): Base URL of the API


I think this args list needs to be updated to only include service

@javuto still seeing api_url in this list but this isn't an arg in the function declaration 🤔

ryandeivert · 2017-11-15T18:48:06Z

stream_alert/alert_processor/outputs.py

+        # Extracting context data to assign the incident
+        rule_context = kwargs['alert'].get('context', {})
+        if rule_context:
+            rule_context = rule_context[self.__service__]


Is there a chance that this service won't exist within the rule_context dict? It looks like this is performed without a get so is it safe to assume there will always be a value in it for this service? (I don't think this is the case so just making sure I'm not missing something?)

I presumed that the service will be there, but it is a good idea to make sure and add some checks. Nice catch!

This is the context that is sent from the rule processor, correct? Cause I'm thinking that will only rarely get set in some rules, no?

Yeah that would be the approach. The context element will be there, even if it is just an empty dict but the __service__ could be missing. Better do the check.
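The check discussed here could be sketched as follows. The function name `service_context` is hypothetical; the two-level `.get` with empty-dict defaults is the point.

```python
def service_context(alert, service):
    """Return the context for `service`, or {} if any level is missing."""
    rule_context = alert.get('context', {})
    if rule_context:
        # The context may exist without an entry for this service
        rule_context = rule_context.get(service, {})
    return rule_context
```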

coveralls · 2017-11-15T21:53:11Z

Coverage increased (+0.07%) to 95.755% when pulling b7c9903 on javier-streamalert-pagerduty-incidents into 0adf349 on master.

coveralls · 2017-11-15T22:11:38Z

Coverage increased (+0.06%) to 95.746% when pulling c70cf67 on javier-streamalert-pagerduty-incidents into 0adf349 on master.

coveralls · 2017-11-15T23:13:41Z

Coverage increased (+0.06%) to 95.746% when pulling 005524e on javier-streamalert-pagerduty-incidents into 0adf349 on master.

coveralls · 2017-11-16T00:26:14Z

Coverage increased (+0.06%) to 95.746% when pulling 51ac380 on javier-streamalert-pagerduty-incidents into 0adf349 on master.

coveralls · 2017-11-16T00:44:42Z

Coverage increased (+0.06%) to 95.746% when pulling f620ec6 on javier-streamalert-pagerduty-incidents into 4e4e9a8 on master.

ryandeivert

Please address last comment, then LGTM

ryandeivert · 2017-11-16T01:13:50Z

stream_alert/alert_processor/outputs.py

@@ -392,7 +390,7 @@ def dispatch(self, **kwargs):
        # Extracting context data to assign the incident
        rule_context = kwargs['alert'].get('context', {})
        if rule_context:
-            rule_context = rule_context[self.__service__]
+            rule_context = rule_context.get(self.__service__, {})


Could we break out the logic below into a function that returns a tuple of the user_to_assign and the assigned_value? just trying to understand this flow better
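One possible shape for such a function, as a sketch only. The name `incident_assignment`, the injected lookup callables, and the `user_reference` / `escalation_policy_reference` type strings are assumptions about the payload, not code from this PR.

```python
def incident_assignment(context, default_policy, find_user_id, find_policy_id):
    """Return a (key, value) pair to merge into the incident payload."""
    user_to_assign = context.get('assigned_user', False)
    if user_to_assign:
        user_id = find_user_id(user_to_assign)
        if user_id:
            assignee = {'id': user_id, 'type': 'user_reference'}
            return 'assignments', [{'assignee': assignee}]

    # No usable user: fall back to the rule's policy, then the default one
    policy_to_assign = context.get('assigned_policy') or default_policy
    policy_id = find_policy_id(policy_to_assign)
    policy = {'id': policy_id, 'type': 'escalation_policy_reference'}
    return 'escalation_policy', policy
```

The caller can then unpack the pair and set `incident[key] = value` without knowing which branch was taken.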

coveralls · 2017-11-16T04:31:44Z

Coverage increased (+0.05%) to 95.741% when pulling 84dc0ca on javier-streamalert-pagerduty-incidents into 4e4e9a8 on master.

ryandeivert · 2017-11-16T17:44:37Z

stream_alert/alert_processor/outputs.py

+                return 'assignments', [{'assignee': user_assignee}]
+
+        # If escalation policy was not provided, use default one
+        policy_to_assign = context.get('assigned_policy') or self._escalation_policy


with a lookup like this, you can use a default value with the get like context.get('assigned_policy', self._escalation_policy) instead of doing the ... or ...
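The two spellings agree in the common cases but not when the key is present with a falsy value, which may matter when choosing between them. A small sketch with hypothetical names:

```python
DEFAULT_POLICY = 'default_policy'  # stands in for self._escalation_policy


def policy_with_get_default(context):
    """Default applies only when the key is missing."""
    return context.get('assigned_policy', DEFAULT_POLICY)


def policy_with_or(context):
    """Default applies when the key is missing OR its value is falsy."""
    return context.get('assigned_policy') or DEFAULT_POLICY
```

With `{'assigned_policy': ''}`, the first returns the empty string and the second returns the default.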

coveralls · 2017-11-16T17:50:17Z

Coverage increased (+0.05%) to 95.741% when pulling 8cb7b02 on javier-streamalert-pagerduty-incidents into 4e4e9a8 on master.

coveralls · 2017-11-16T18:30:10Z

Coverage increased (+0.05%) to 95.745% when pulling 056c983 on javier-streamalert-pagerduty-incidents into bc3391c on master.

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from 6c22135 to 0842ddf Compare November 10, 2017 18:25

jacknagz reviewed Nov 10, 2017

ryandeivert reviewed Nov 10, 2017

ryandeivert added the alerting outputs label Nov 11, 2017

ryandeivert added this to the 1.6.0 milestone Nov 11, 2017

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from 0842ddf to f2ca160 Compare November 13, 2017 23:18

ryandeivert reviewed Nov 14, 2017

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from 33c0f77 to eb17826 Compare November 14, 2017 19:01

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from eb17826 to ac63b6b Compare November 15, 2017 16:54

ryandeivert reviewed Nov 15, 2017

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from ac63b6b to b7c9903 Compare November 15, 2017 21:50

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from 51ac380 to f620ec6 Compare November 16, 2017 00:40

ryandeivert approved these changes Nov 16, 2017

ryandeivert reviewed Nov 16, 2017

javier_marcos added 3 commits November 16, 2017 10:26

More tests and addressing comments

e448de6

Addressing comments and more tests

45666de

Splitting dispatch and more tests

ee33239

javier_marcos added 7 commits November 16, 2017 10:26

Refactoring functions and tests

92b141d

Fixed side_effect for tests

2fce8ef

Removing debug code and addressing comments

f086e09

Removing api_url and check __service__

5a0d65f

Adding pagerduty-incident to CLI tests

40b37c7

Function to prepared the incident assignment

48ed05a

Simplifying how to use the default policy

056c983

javuto force-pushed the javier-streamalert-pagerduty-incidents branch from 8cb7b02 to 056c983 Compare November 16, 2017 18:26

javuto merged commit c1b293a into master Nov 16, 2017

javuto deleted the javier-streamalert-pagerduty-incidents branch November 16, 2017 18:31

javuto mentioned this pull request Nov 20, 2017

[output] PagerDuty incident requires "From:" header #493

Merged

javuto mentioned this pull request Nov 27, 2017

[output] Adding support for assign priority to incidents #498

Merged
